On the swap-distances of different realizations of a 
graphical degree sequence 



Péter L. Erdős a,1, Zoltán Király b,2, István Miklós a,3

a Alfréd Rényi Institute, Reáltanoda u 13-15 Budapest, 1053 Hungary
email: <erdos.peter,miklos.istvan>@renyi.mta.hu
b Department of Computer Science and EGRES (MTA-ELTE), Eötvös University,
Pázmány Péter sétány 1/C, Budapest, 1117 Hungary
email: kiraly@cs.elte.hu



Abstract 

One of the first graph theoretical problems to receive serious attention
(already in the fifties of the last century) was to decide whether a given
integer sequence is equal to the degree sequence of a simple graph (or is
graphical, for short). One method to solve this problem is the greedy
algorithm of Havel and Hakimi, which is based on the swap operation.
Another, closely related question is to find a sequence of swap operations
that transforms one graphical realization into another one of the same
degree sequence. This latter problem received particular emphasis in
connection with rapidly mixing Markov chain approaches to sampling
uniformly from all possible realizations of a given degree sequence. (This
becomes a matter of interest in connection with - among others - the study
of large social networks.) Earlier there were only crude upper bounds on
the shortest possible length of such swap sequences between two
realizations. In this paper we develop formulae (Gallai-type identities)
for these swap-distances of any two realizations of simple undirected or
directed degree sequences. These identities considerably improve the known
upper bounds on the swap-distances.

1. Introduction

The comprehensive study of graphs (or more precisely the linear graphs,
as they were called at that time) began sometime in the late forties,
through seminal works by P. Erdős, P. Turán, W.T. Tutte, T. Gallai and
others. One problem which received considerable attention was the existence
of certain subgraphs of a given graph $G$. Such a subgraph could be, for
example, a perfect matching in a (not necessarily bipartite) graph, or a
Hamiltonian cycle, etc. Generally these substructures are called factors.
The first important and rather general results of this kind were due to
Tutte (in 1952), who gave necessary and sufficient conditions for the
existence of $f$-factors [18, 19].

In cases where $G$ is a complete graph, the $f$-factor problem becomes
easier: then we are simply interested in the existence of a graph with a
given degree sequence, and at least two solutions of different kinds were
developed around 1960. One was due to Havel [9], who constructed a famous
greedy algorithm to answer this degree sequence problem. His algorithm
was based on the notion of a swap. It is interesting to mention the almost
completely forgotten paper of Senior ([17]), who studied the problem of
generating graphs with multiple edges but without loops: his goal was to
find possible molecules with given composition but with different
structures. He also discovered the swap operation, but he called it
transfusion. The other approach was the equally famous Erdős-Gallai theorem
([2]), which gave a necessary and sufficient condition in the form of a
sequence of inequalities. In this latter paper Havel's method was an
ingredient of the proof, and the authors also observed that their result is
a consequence of Tutte's $f$-factor theorem.

In 1962 Hakimi studied the degree sequence problem in undirected graphs
with multiple edges and loops ([7]). He developed an Erdős-Gallai type
result for this much simpler case, and for the case of simple graphs he
rediscovered the greedy algorithm of Havel. Since then this algorithm has
been referred to as the Havel-Hakimi algorithm.

Already from the general $f$-factor theorem of Tutte one can derive a
polynomial time algorithm to solve the degree sequence problem, but this
was not done at that time. Havel's algorithm provided a quadratic (in $n$)
time construction method for the required graphical realization.
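
The greedy idea can be illustrated with a minimal Python sketch. The function name `havel_hakimi` and the convention that vertices are the positions in the input list are assumptions made for this illustration, not notation from [9]; because it re-sorts in every round, this version is slightly slower than quadratic.

```python
def havel_hakimi(degrees):
    """Return an edge list realizing `degrees`, or None if not graphical.

    Greedy rule: repeatedly take a vertex of largest remaining degree d
    and join it to the d vertices of largest remaining degree.
    """
    remaining = [[d, v] for v, d in enumerate(degrees)]
    edges = []
    while remaining:
        remaining.sort(reverse=True)       # largest remaining degree first
        d, v = remaining.pop(0)
        if d < 0 or d > len(remaining):
            return None                    # not enough partners for v
        for entry in remaining[:d]:
            entry[0] -= 1
            if entry[0] < 0:
                return None                # a partner has no free degree
            edges.append((v, entry[1]))
    return edges
```

For example, `havel_hakimi([3, 2, 2, 2, 1])` returns a simple graph with these degrees, while `havel_hakimi([3, 3, 1, 1])` returns `None`.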

The construction of - preferably "typical" - graphical realizations of
given degree sequences became an important problem in the last two decades
in connection with the emergence of huge networks in social sciences,
medicine, biology or internet technology, to name only a few.

To mention just one example, data is collected from anonymous surveys in
epidemic studies of sexually transmitted diseases, where the individuals
specify the number of different partners they have had in a given period of
time, without revealing their identity. In this case, epidemiologists
should construct typical contact graphs obeying the empirical degree
sequence to estimate epidemiological parameters.

Constructing all possible realizations of a given degree sequence is
typically a very time-consuming task, since usually there are exponentially
many different realizations. Here we do not consider the computationally
almost hopeless isomorphism problem. In this paper the vertices are labeled
(and therefore distinguishable), and two isomorphic realizations where the
isomorphism changes the labels are considered to be different ones.

The methodology to construct all possible realizations is itself not
self-evident: for example the Havel-Hakimi algorithm is not strong enough
to find all of them. It is also important that no particular realization
should be output more than once, and, finally, that the waiting times
between two consecutive outputs should not be too long. These concerns were
addressed in [11].

However, when our goal is to find a "typical" realization but there are
exponentially many different ones, then generating all of them and choosing
one realization randomly is simply not feasible. One way to overcome this
difficulty is to construct a good Markov Chain Monte Carlo (MCMC) method.
To that end we need a particular operation to walk on the space of the
different realizations; this operation is called a swap, and it is
essentially the same as the operation in Havel's algorithm: we choose four
vertices, where in the induced subgraph there is a one-factor, while
another one-factor is missing, and we exchange the existing one-factor for
the missing one. This clearly preserves the degree sequence.
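
One step of such a walk can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `random_swap_step`, the proposal rule (two uniformly chosen edges and a fair coin for the re-pairing) and the "stay put if the swap is invalid" convention are choices made here, not a description of any particular chain from the literature.

```python
import random

def random_swap_step(edges, rng):
    """Try one random swap on a simple graph; return the new edge set.

    `edges` is a set of frozenset({u, v}) pairs with comparable vertex
    labels; `rng` is a random.Random instance. If the proposed swap
    would create a loop or a parallel edge, the graph is left unchanged.
    """
    edges = set(edges)
    e, f = rng.sample(sorted(edges, key=sorted), 2)
    if e & f:
        return edges                  # the two edges share a vertex
    a, c = sorted(e)
    b, d = sorted(f)
    if rng.random() < 0.5:
        b, d = d, b                   # choose one of the two re-pairings
    new = {frozenset((b, c)), frozenset((a, d))}
    if new & edges:
        return edges                  # the other one-factor is not missing
    return (edges - {e, f}) | new
```

Repeated calls keep every vertex degree fixed, which is exactly the property used above.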

It is interesting to note that - as one can learn from Erdős and Gallai -
the swap operation for this purpose was originally discovered by Petersen,
already in 1891 ([15]).

The problem of an Erdős-Gallai type characterization of bipartite degree
sequences was studied already in the mid-fifties by Gale (in [5]) using
network flow techniques. In the same year Ryser gave a direct proof for
this characterization, using a matrix theoretical language ([16]), and
showed that any particular realization of a bipartite degree sequence can
be transformed, using a sequence of swaps, into any other realization. Both
results were formulated in the language of directed graphs without multiple
edges but with possible loops. (For the connection between bipartite and
directed degree sequence problems see Section 5.) The corresponding result
for simple directed graphs (no loops, no multiple edges) is due to
Fulkerson ([4]).

Havel-Hakimi type results are part of the folklore in connection with
bipartite degree sequences, but it is hard to find a definitive reference
for them (the book [20] discusses the problem in Exercise 1.4.32, and the
paper [10] provides a proof as a by-product).

In the case of simple directed graphs, Havel-Hakimi type results were
proved by Kleitman and Wang ([13]) for an extension given by Kundu [12].
Swap sequences between realizations of directed graphs were rediscovered in
[3]. The situation here is more complicated than in the previous cases:
using only two edges for a swap is not always enough; sometimes we have to
use three-edge swaps. Moreover there are two different kinds of three-edge
swaps: in the type 1 swap the three edges form an oriented $C_3$ and the
result of the swap is the oppositely oriented triangle. In the type 2 swap
the three involved directed edges determine four vertices (see [13, 3]).

In [3] a weak upper bound was proved for the swap-distance of two
realizations of directed degree sequences. In the proof all three types of
swaps were applied, and each was counted as one. However, as LaMar proved
recently (in [14]), swap sequences between any two realizations may omit
type 2 triple-swaps completely. In Section 5 we will strengthen this result
(see Theorem 5.5). Finally Greenhill proved ([6]) that in the case of
regular directed degree sequences (when all in-degrees and out-degrees are
the same) all triple-swaps can be omitted (if $n > 3$). The reason for the
need for some type 2 swaps in [13, 3] was simple: in the Havel-Hakimi
situation (by analogy) one sought a single swap changing a particular
directed edge to a particular non-edge. However, when a transformation
sequence is sought, this is not a requirement. As it turned out, type 2
triple-swaps are so-called non-triangular $C_6$-swaps (see Remark 5.1) and
can be substituted by two regular swaps.

These problems have a long and lively history, but we do not survey it
here. We just want to point out that knowing the maximum length of the
necessary swap sequences can lead to better estimates for the mixing time
of Markov chains using swap operations. Until now there were only weak
upper bounds on those lengths: they are surely shorter than twice the
number of edges in the realizations (which is equal to the sum of the
values in the degree sequence). This applies to simple directed or
undirected graphical degree sequences (including bipartite ones as well);
see for example [3].

The main goal of this paper is to determine a formula for the
swap-distance (that is, the length of the shortest possible swap sequence)
between any two particular realizations $G_1$ and $G_2$. Here we will prove
a Gallai-type identity

\[ \mathrm{dist}(G_1, G_2) = \frac{|E(G_1) \,\Delta\, E(G_2)|}{2} - \mathrm{maxC}(G_1, G_2), \qquad (1.1) \]

where $\Delta$ denotes the symmetric difference and maxC is a positive
integer which depends on the realizations. In the case of directed degree
sequences triple-swaps should count as 2 in the swap-distance. (For an
explanation, see the definitions after Lemma 5.3.) However, as it will turn
out, we only need type 1 triple-swaps. We can forbid type 2 triple-swaps
while the equation does not change.

It is very important to understand that while the right side of equation
(1.1) can indeed be interpreted as "the exact value of the swap-distance",
maxC is possibly (probably) not an efficiently computable value. We think
that the right goal here is to find good estimates for maxC. We made the
first steps in this direction; see Theorems 3.8, 4.1 and 5.6.

The structure of the paper is the following: in Section 2 we introduce 
the definitions and recall some known facts and algorithms. In Section 3 
we prove (1.1) for undirected degree sequences. In the very short Section 
4 we describe the consequences of the previous results for bipartite degree 
sequences. Finally in Section 5 we discuss the problem for directed degree 
sequences based on further considerations on realizations of bipartite degree 
sequences. 

2. Definitions, notations 

Throughout the paper $G$ denotes an undirected simple graph with vertex
set $V(G) = \{v_1, v_2, \ldots, v_n\}$ and edge set $E(G)$. Consider a
sequence of positive integers $\mathbf{d} = (d_1, d_2, \ldots, d_n)$. If
there is a simple graph $G$ with degree sequence $\mathbf{d}$, i.e., where
for each $i$ we have $d(v_i) = d_i$, then we call the sequence $\mathbf{d}$
a graphical sequence, and in this case we also say that $G$ realizes
$\mathbf{d}$.

The analogous notions for bipartite graphs are the following: if $B$ is a
simple bipartite graph then its vertex classes will be denoted by
$U(B) = \{u_1, \ldots, u_k\}$ and $W(B) = \{w_1, \ldots, w_\ell\}$, and we
keep the notation $V(B) = U(B) \cup W(B)$. The bipartite degree sequence of
$B$, $\mathbf{bd}(B)$, is defined as follows:

\[ \mathbf{bd}(B) = \Big( \big(d(u_1), \ldots, d(u_k)\big), \big(d(w_1), \ldots, d(w_\ell)\big) \Big). \]

Let $G$ be a simple graph and assume that $a$, $b$, $c$ and $d$ are
different vertices. If $G$ is a bipartite graph $B$ then we also require
that $a, b \in U(B)$ and $c, d \in W(B)$. Furthermore assume that
$(a,c), (b,d) \in E(G)$ while $(b,c), (a,d) \notin E(G)$. Then

\[ E(G') = \big(E(G) \setminus \{(a,c), (b,d)\}\big) \cup \{(b,c), (a,d)\} \qquad (2.1) \]

is another realization of the same degree sequence (and if $G$ is a
bipartite graph then $G'$ remains bipartite). The operation described above
is called a swap. This operation is used in the Havel-Hakimi algorithm, and
Petersen proved [15] - and several authors later reproved - that any
realization of a degree sequence can be transformed into any other
realization of the same degree sequence using only consecutive swap
operations.
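
A minimal Python sketch of the swap (2.1) for the non-bipartite case follows. The function name `swap` and the representation of edges as `frozenset` pairs are assumptions made for this illustration.

```python
def swap(edges, a, b, c, d):
    """Apply the swap (2.1): remove ac and bd, add bc and ad.

    `edges` is an iterable of vertex pairs; the result is a set of
    frozenset({u, v}) pairs. Raises ValueError if the four vertices are
    not distinct, if ac or bd is missing, or if bc or ad is present.
    """
    edges = {frozenset(e) for e in edges}
    if len({a, b, c, d}) != 4:
        raise ValueError("a, b, c, d must be different vertices")
    old = {frozenset((a, c)), frozenset((b, d))}
    new = {frozenset((b, c)), frozenset((a, d))}
    if not old <= edges or new & edges:
        raise ValueError("swap is not applicable")
    return (edges - old) | new
```

For instance, `swap([(0, 2), (1, 3)], 0, 1, 2, 3)` replaces the edges 02 and 13 by 12 and 03, and every vertex keeps degree one.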

As throughout the paper all graphs will be simple, from this point we
will omit the word "simple".

A graph $G$, where the edges are colored either red or blue, will be
called a red-blue graph. For a vertex $v$ denote by $d_r(v)$ and $d_b(v)$
the degree of vertex $v$ in red and blue edges, resp. This red-blue graph
is balanced if for each $v \in V(G)$ we have $d_r(v) = d_b(v)$.

A circuit in a graph $G$ is a closed trail (each edge can be used at most
once). As the graph is simple, a circuit is determined by the sequence of
the vertices $v_0, \ldots, v_t$, where $v_0 = v_t$. Note that there can
also be other indices $i < j$ such that $v_i = v_j$. A circuit is called a
cycle if it is simple, i.e., for any $i < j$, $v_i = v_j$ only if $i = 0$
and $j = t$.

A circuit (or a cycle) in a balanced red-blue graph is called alternating
if the color of its edges alternates (i.e., the color of the edge from
$v_i$ to $v_{i+1}$ differs from the color of the edge from $v_{i+1}$ to
$v_{i+2}$, and also edges $v_0v_1$ and $v_{t-1}v_t$ have different colors -
consequently alternating circuits have even length).

By Euler's famous method one can easily prove the following 

Proposition 2.1. If $G$ is a balanced red-blue graph then the edge set can
be decomposed into alternating circuits. If $B$ is a bipartite balanced
red-blue graph then the edge set can be decomposed into alternating cycles.
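
The Euler-type argument can be sketched in Python as follows. The function name `alternating_circuits` and the greedy rule used here (leave the start vertex on a red edge, alternate colors, and stop at the first return to the start vertex along a blue edge) are assumptions of this illustration; the sketch produces some alternating circuit decomposition, not a maximum size one, and it assumes that the input is balanced.

```python
from collections import defaultdict

def alternating_circuits(red, blue):
    """Decompose a balanced red-blue graph into alternating circuits.

    `red` and `blue` are iterables of vertex pairs. Each circuit is
    returned as a closed vertex list [v0, v1, ..., v0].
    """
    adj = {"red": defaultdict(set), "blue": defaultdict(set)}
    for color, edges in (("red", red), ("blue", blue)):
        for u, v in edges:
            adj[color][u].add(v)
            adj[color][v].add(u)
    circuits = []
    for start in list(adj["red"]):
        while adj["red"][start]:
            walk, v, color = [start], start, "red"
            while True:
                # Balance guarantees an unused edge of the needed color.
                w = adj[color][v].pop()
                adj[color][w].discard(v)
                walk.append(w)
                v = w
                if v == start and color == "blue":
                    break          # closed, and the colors match up
                color = "blue" if color == "red" else "red"
            circuits.append(walk)
    return circuits
```

On the alternating 4-cycle with red edges 01, 23 and blue edges 12, 30 the sketch returns the single circuit `[0, 1, 2, 3, 0]`.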

If two graphs, $G_1$ and $G_2$, are different realizations of the same
degree sequence, then we associate with them the following balanced
red-blue graph. The vertex set is $V(G_1) = V(G_2)$ and the edge set is the
symmetric difference $E(G_1) \Delta E(G_2)$. An edge is colored red if it
is in $E(G_1) \setminus E(G_2)$, and it is colored blue if it is in
$E(G_2) \setminus E(G_1)$.
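
As a small illustration, the associated red-blue graph can be computed as below. The helper names `red_blue_graph` and `is_balanced` are hypothetical and chosen only for this sketch.

```python
from collections import Counter

def red_blue_graph(g1, g2):
    """Return (red, blue): edges only in g1, and edges only in g2."""
    e1 = {frozenset(e) for e in g1}
    e2 = {frozenset(e) for e in g2}
    return e1 - e2, e2 - e1

def is_balanced(red, blue):
    """Check that every vertex has equal red and blue degree."""
    d_r = Counter(v for e in red for v in e)
    d_b = Counter(v for e in blue for v in e)
    return d_r == d_b
```

Two realizations of the same degree sequence always give a balanced pair, because an edge common to both graphs contributes equally to both degrees and is removed from both colors.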

Definition 2.2. If $G$ is a balanced red-blue graph then let
$\mathrm{maxC}_u(G)$ denote the number of circuits in a maximum size
(= maximum cardinality) alternating circuit decomposition of $G$. If $G_1$
and $G_2$ are two realizations of the same degree sequence then let
$\mathrm{maxC}_u(G_1, G_2) = \mathrm{maxC}_u(G)$, where $G$ is the
associated balanced red-blue graph.

Definition 2.3. Let $G_1$ and $G_2$ be two given realizations of
$\mathbf{d}$. Denote by $\mathrm{dist}_u(G_1, G_2)$ the length of the
shortest swap sequence from $G_1$ to $G_2$.

A pair of vertices $u$ and $v$ will be called a chord if it can hold an
edge. That is, for non-bipartite graphs $uv$ is a chord if and only if
$u \ne v$, but for a bipartite graph $B$, $uv$ is a chord if and only if
$u \in U(B)$ and $v \in W(B)$ or vice versa. If a circuit
$C = v_0, \ldots, v_t$ is given and $v_iv_j$ is a chord, then we will also
call the pair $ij$ of subscripts a chord.

For directed graphs we consider the following definitions: Let $G$ denote
a directed graph (no parallel edges, no loops) with vertex set
$X(G) = \{x_1, x_2, \ldots, x_n\}$ and edge set $E(G)$. We use the
bi-sequence

\[ \mathbf{dd}(G) = \big( (d_1^+, d_2^+, \ldots, d_n^+), (d_1^-, d_2^-, \ldots, d_n^-) \big) \]

to denote the degree sequence, where $d_i^+$ denotes the out-degree of
vertex $x_i$ while $d_i^-$ denotes its in-degree. A bi-sequence of
non-negative integers is called a directed degree sequence if there exists
a directed graph $G$ such that $(\mathbf{d}^+, \mathbf{d}^-) =
\mathbf{dd}(G)$. In this case we say that $G$ realizes our directed degree
sequence.

A directed graph $G$ is a balanced red-blue graph if for every vertex the
red in-degree is the same as the blue in-degree, and moreover the red
out-degree is the same as the blue out-degree. Thus if $G_1$ and $G_2$ are
different realizations of the same directed degree sequence then the
associated red-blue graph (defined similarly as for the undirected case) is
a balanced red-blue graph.

The definition of an alternating circuit differs from the one for
undirected graphs as follows. A circuit $v_0, \ldots, v_t$ in a balanced
red-blue graph $G$ is alternating if both the colors and the directions
alternate (e.g., if $v_iv_{i+1}$ is a red directed edge then
$v_{i+2}v_{i+1}$ is a blue directed edge).

Again, by Euler's method one can easily prove the following: 

Proposition 2.4. If G is a balanced red-blue graph then the edge set can be 
decomposed into alternating circuits. 

Definition 2.5. Assume that $G$ is a directed balanced red-blue graph and
let $\mathrm{maxC}_d(G)$ denote the number of circuits in a maximum size
alternating circuit decomposition of $G$. If $G_1$ and $G_2$ are two
realizations of the same directed degree sequence then let
$\mathrm{maxC}_d(G_1, G_2) = \mathrm{maxC}_d(G)$, where $G$ is the
associated balanced red-blue graph.

For directed graphs we use the old trick, applied already by Gale [5]:
each directed graph $G$ can be represented by a bipartite graph $B(G)$,
where each class consists of one copy of every vertex. The edges adjacent
to a vertex $u_x$ in class $U$ represent the out-edges from $x$, while the
edges adjacent to a vertex $w_x$ in class $W$ represent the in-edges to $x$
(so a directed edge $xy$ is identified with the edge $u_xw_y$). Note that
the directed degree sequence of $G$ is the same as the bipartite degree
sequence of $B(G)$. If $G$ is a directed balanced red-blue graph then
naturally we get $B(G)$ as a balanced red-blue graph, and the alternating
circuits of $G$ correspond to the alternating cycles of $B(G)$. For an
alternating circuit C of $G$ we denote the corresponding alternating cycle
of $B(G)$ by C.

As loops are not allowed in $G$, edges of the form $u_xw_x$ are also
forbidden in $B(G)$, so they will be called non-chords.
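
The representation can be sketched in a few lines of Python. The tagged pairs `("u", x)` and `("w", x)` standing for the two copies of a vertex, and the names `bipartite_representation` and `is_chord`, are conventions assumed only for this illustration.

```python
def bipartite_representation(arcs):
    """Map each directed edge x -> y to the bipartite edge (u_x, w_y)."""
    return {(("u", x), ("w", y)) for x, y in arcs}

def is_chord(u, w):
    """u_x w_y can hold an edge unless x == y (a loop in G)."""
    return u[1] != w[1]
```

With this encoding the out-degree of `x` is the degree of `("u", x)` and the in-degree of `x` is the degree of `("w", x)`.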

For two graphs $G_1$ and $G_2$ (or bipartite graphs or directed graphs)
with the same degree sequence (or bipartite degree sequence or directed
degree sequence, resp.) we will use $H'(G_1, G_2)$ for the halved Hamming
distance $\frac{|E(G_1) \Delta E(G_2)|}{2}$. Note that $H'(G_1, G_2)$ is
the same as the number of red (or blue) edges in the associated balanced
red-blue graph $G$.

3. Undirected degree sequences 

In this Section, we prove equality (1.1) for undirected degree sequences. 






Lemma 3.1. Let $C = v_0, v_1, \ldots, v_{2t} = v_0$ be an alternating
circuit in a balanced red-blue graph $G$, in which for some $i < j < 2t$,
$j - i$ is even and $v_i = v_j$. Then the circuit can be decomposed into
two shorter alternating circuits.

Proof. Since both $v_i, \ldots, v_j$ and
$v_j, v_{j+1}, \ldots, v_{2t}, v_1, \ldots, v_i$ contain an even number of
edges, both of them form alternating circuits. □

Definition 3.2. We call an alternating circuit
$C = v_0, v_1, \ldots, v_{2t}$ elementary if (i) no vertex appears more
than twice in it, and (ii) there exists an integer $0 \le i < 2t$ such that
both vertices $v_i$ and $v_{i+1}$ occur only once in the circuit.

Lemma 3.3. Let $C_1, \ldots, C_h$ be a maximum size alternating circuit
decomposition of a balanced red-blue graph $G$ (that is,
$h = \mathrm{maxC}_u(G)$). Then each circuit is elementary.

Proof. (i) First assume that a circuit $C_i = v_0, \ldots, v_{2t}$
contains the vertex $v$ three times. Then two of the occurrences have the
same parity, and Lemma 3.1 applies. But this contradicts the maximality.
Therefore any vertex in any circuit of a maximum size decomposition occurs
at most twice, and the two subscripts of each repeated vertex (within any
circuit) have different parities. We call a pair $0 \le i < j < 2t$ a
non-chord if $v_i = v_j$. The length of a non-chord $ij$ is defined to be
$\min(|i - j|, 2t - |i - j|)$. We proved that each index $i$ is part of at
most one non-chord, and the length of any non-chord is odd.

We next prove that non-chords cannot intersect. More precisely, if
$0 \le i < k < j < \ell < 2t$ then, if $ij$ is a non-chord then $k\ell$ is
a chord. Let $C'$ be the following alternating circuit:

\[ v_0, \ldots, v_i = v_j, v_{j-1}, v_{j-2}, \ldots, v_k, \ldots, v_{i+1}, v_i = v_j, v_{j+1}, \ldots, v_\ell, \ldots, v_{2t}. \]

In this new circuit (which consists of the same edges as the original
circuit) the new index of vertex $v_k$ is $k'$, where $k - k'$ is odd.
Therefore if $k - \ell$ was odd then $k' - \ell$ is even. Thus for this
circuit Lemma 3.1 applies - which in turn shows that the original circuit
decomposition is not a maximum size one, a contradiction.

(ii) By re-indexing the vertices of the circuit we may assume that $0k$ is
the shortest non-chord of $C_i$. Since $G$ is simple we have $k > 2$.
Consequently indices 1 and 2 cannot participate in any non-chord, otherwise
they would induce crossing non-chords. □

From the middle part of the proof one can deduce a much stronger
statement. Given a circuit $C = v_0, \ldots, v_{2t}$ we call a vertex
unique if it appears exactly once in that circuit.

Theorem 3.4. Let $C_1, \ldots, C_h$ be a maximum size alternating circuit
decomposition of a balanced red-blue graph $G$. Then

(i) each circuit $C_i = v_0, \ldots, v_{2t}$ contains at least $2t/3 + 2$
unique vertices;

(ii) the length of each circuit in the decomposition is at most
$(3/2)(n - 1)$, consequently

\[ \mathrm{maxC}_u(G) \ge \frac{2|E(G)|}{3n}. \]

Proof. (i) Let $\mu$ denote the number of unique vertices of the circuit
and let $\nu$ denote the number of non-unique vertices (not indices). Since
no vertex appears more than twice in the circuit, $2t = \mu + 2\nu$ and the
number of non-chords is exactly $\nu$.

If $t = 2$ then non-chords do not exist, so there is nothing to prove.
Assume now that $t > 2$ and consider the following planar graph $P$ with
$2t$ vertices. First we draw a convex $2t$-gon in the plane with vertices
$p_0, \ldots, p_{2t-1}$. Next, for each non-chord $ij$ we connect $p_i$ to
$p_j$ by a straight line segment. The proof of Lemma 3.3 shows that there
are no crossing non-chords, therefore this is a planar embedding.

Now we take the planar dual P* and delete the vertex of the dual corre- 
sponding to the infinite face of P. We call the resulting graph T. 

It is easy to see that $T$ is a tree. Indeed, we can argue by
contradiction. If $T$ contains a cycle then the original planar graph
contains a vertex $v$ within this cycle. But each original vertex of the
graph is adjacent to the infinite face (the ocean). Therefore vertex $v$ is
also next to the ocean, so the dual vertex $O$ corresponding to the ocean
must not be outside this cycle. But that would imply that $O$ belongs to
the cycle, a contradiction.

The edges of $T$ correspond to the non-chords, so
$|E(T)| = |V(T)| - 1 = \nu$. The vertices of the tree correspond to the
finite faces of $P$. We claim that if $v \in V(T)$ has degree at most 2 in
the tree, then the corresponding face contains at least 2 unique vertices.
If a face is adjacent to one non-chord (it corresponds to a leaf in $T$)
then it has at least two unique vertices since $G$ is simple. Suppose there
is a face of $P$ adjacent to two non-chords, $ij$ and $i'j'$, where we may
assume that $i < i' < j' < j$. As $G$ is simple and the non-chords have odd
length, we can conclude that $i' - i + j - j' \ge 4$, proving the claim.

Let $n_{\le d}$ (and $n_{\ge d}$) denote the number of vertices of $T$
having degree at most $d$ (at least $d$, resp.). We prove by induction on
$|V(T)|$ that if $|V(T)| > 1$ then $n_{\le 1} \ge n_{\ge 3} + 2$ (if we
delete a leaf then either none of $n_{\le 1}$ and $n_{\ge 3}$ is changed,
or $n_{\le 1}$ decreases by one and $n_{\ge 3}$ decreases by at most one).
Consequently $n_{\le 2} \ge n_{\ge 3} + 2$. So

\[ \mu \ge 2 n_{\le 2} \ge |V(T)| + 2 = |E(T)| + 3 = \nu + 3. \]

As $2t = \mu + 2\nu$, we have $2t \le 3\mu - 6$, consequently
$\mu \ge 2t/3 + 2$, proving our first statement.

(ii) To prove the second statement we only need a simple calculation:

\[ t + (t/3 + 1) \le (\mu/2 + \nu) + \mu/2 = \mu + \nu \le n, \]

so indeed $2t \le (3/2)(n - 1)$. □

Now we are ready to analyze the minimum size swap sequences. We start 
with the simplest case: 

Lemma 3.5. Assume that $G_1$ and $G_2$ are two realizations of the same
degree sequence, and $G$ is a balanced red-blue graph consisting of the
edges in $E(G_1) \Delta E(G_2)$. Suppose that $E(G)$ is one alternating
elementary circuit $C$ of length $2t$. Then there is a swap sequence of
length $t - 1$ between $G_1$ and $G_2$.

Proof. Let us call $G_1$ the start and $G_2$ the stop graph. We apply
induction on the size $|C|$ of the symmetric difference of the actual start
and stop graphs. We may assume that $v_0$ occurs exactly once in $C$ and
that the current edge $v_0v_1$ belongs to the start graph (since this
circuit is elementary, due to Lemma 3.3, we can always renumber the
vertices accordingly).

When $t = 2$ the statement is clear, since $C$ is an alternating cycle of
length four, so assume now that $t > 2$. Consider the chords
$v_0v_1, v_1v_2, v_2v_3, v_3v_0$. (Let us recall: by definition we have
$v_0v_1, v_2v_3 \in E(G_1) \setminus E(G_2)$ and
$v_1v_2 \in E(G_2) \setminus E(G_1)$, while
$v_3v_0 \notin E(G_1) \Delta E(G_2)$.)

When the chord $v_3v_0$ is a non-edge in the start (and therefore in the
stop) graph, then we can perform the
$v_0v_1, v_2v_3 \Rightarrow v_1v_2, v_3v_0$ swap in the start graph. After
this operation the circuit will be shorter by two edges and remains
elementary. So we can apply the inductive hypothesis to the new start/stop
graph pair. If, however, the chord $v_3v_0$ is an edge in both the start
and stop graphs, then we can carry out the
$v_1v_2, v_3v_0 \Rightarrow v_0v_1, v_2v_3$ swap in the stop graph, still
maintaining all the necessary properties. So we can proceed with the
induction on the new start/stop graph pair. □
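
The inductive procedure can be sketched in Python for the special case where the symmetric difference is a single alternating cycle, so that all vertices are distinct; this restriction, the function name `cycle_swap_sequence` and the encoding of a swap as a pair (removed edges, added edges) are assumptions of the sketch rather than part of the lemma.

```python
def cycle_swap_sequence(g1, g2, cycle):
    """Swap sequence of length t - 1 from g1 to g2.

    `cycle` lists the distinct vertices v0, ..., v_{2t-1} of the
    symmetric difference in cyclic order, with v0v1 an edge of g1
    (and not of g2). Each swap is returned as (removed, added).
    """
    E = lambda a, b: frozenset((a, b))
    start = {frozenset(e) for e in g1}
    stop = {frozenset(e) for e in g2}
    c = list(cycle)
    head, tail = [], []
    while len(c) > 4:
        v0, v1, v2, v3 = c[:4]
        forward = ((E(v0, v1), E(v2, v3)), (E(v1, v2), E(v3, v0)))
        if E(v3, v0) not in start:
            # v3v0 is a non-edge: swap in the start graph.
            start -= set(forward[0])
            start |= set(forward[1])
            head.append(forward)
        else:
            # v3v0 is an edge in both graphs: undo the swap in the stop
            # graph; going forward it is performed at the very end.
            stop -= set(forward[1])
            stop |= set(forward[0])
            tail.append(forward)
        c = [v0] + c[3:]              # the cycle is shorter by two edges
    v0, v1, v2, v3 = c
    head.append(((E(v0, v1), E(v2, v3)), (E(v1, v2), E(v3, v0))))
    return head + tail[::-1]
```

Applying the returned swaps to `g1` in order yields `g2`, and every intermediate graph is simple.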

Theorem 3.6. For all pairs of realizations $G_1, G_2$ of the same degree
sequence, we have

\[ \mathrm{dist}_u(G_1, G_2) = H'(G_1, G_2) - \mathrm{maxC}_u(G_1, G_2). \qquad (3.1) \]

Proof. (i/a) The inequality LHS $\le$ RHS is a simple application of Lemma
3.5: take a maximal alternating circuit decomposition $C_1, \ldots, C_k$
where $k = \mathrm{maxC}_u(G_1, G_2)$, and define realizations
$G_1 = H_0, H_1, \ldots, H_k = G_2$ such that for all
$i = 0, \ldots, k - 1$ realizations $H_i$ and $H_{i+1}$ differ exactly in
$C_{i+1}$. Then by Lemma 3.3 all circuits are elementary, so the
application of Lemma 3.5 to each pair $H_i, H_{i+1}$ proves this
inequality.

(i/b) One can find a recursive proof as well. This is based on the
following easy observation: assume that the shortest circuit $C_1$ in the
previous maximal decomposition has the shortest length among all circuits
in all possible maximal circuit decompositions. Then

Lemma 3.7. There exists no edge in any of the other circuits which would
divide $C_1$ into two odd length trails.

Proof. Assume the opposite: the chord $v_1v_{2\ell}$ of $C_1$ is an edge in
$G_2$. Then this edge together with one of the trails of $C_1$ forms a
circuit shorter than $C_1$, while the other trail together with the
remaining part of $C_i$ forms another alternating circuit. So we
constructed another circuit decomposition with the same number of circuits,
but with a shorter shortest circuit, a contradiction. (It is still possible
that a chord in $C_1$ belongs to another circuit as well - but this divides
$C_1$ into two even-length trails. However this will not cause any
problem.) □

Now we can operate as follows: consider the (actual) symmetric differ- 
ence, find a maximal circuit decomposition with a shortest elementary cir- 
cuit. Apply the procedure in Lemma 3.5 for this circuit (by Lemma 3.7 we 
can do it). Repeat the whole process with the new (and smaller) symmetric 
difference. 

(ii) We finish the proof of Theorem 3.6 by proving that LHS $\ge$ RHS. We
rearrange (3.1) into

\[ \mathrm{maxC}_u(G_1, G_2) \ge H'(G_1, G_2) - \mathrm{dist}_u(G_1, G_2). \]

Assume that the sequence $G_1 = H_0, H_1, \ldots, H_{k-1}, H_k = G_2$
describes a minimum length realization sequence from $G_1$ to $G_2$, where
for each $i = 0, \ldots, k - 1$ the graphs $H_i$ and $H_{i+1}$ are in
swap-distance 1. It is clear that any consecutive swap subsequence from
$H_i$ to $H_j$ must also be a minimum one. For each $i$ we use the notation

\[ \Delta_i := E_1 \Delta E(H_i), \]

where $E_1 = E(G_1)$ and, similarly, $E_2 = E(G_2)$.

We are going to construct a circuit decomposition of
$E_1 \Delta E_2 = \Delta_k$ into at least
$H'(G_1, G_2) - \mathrm{dist}_u(G_1, G_2)$ alternating circuits. By part
(i) this will also prove that the two sides are actually equal (otherwise
the swap sequence cannot be minimum). We proceed by induction: we will show
that for all $i = 0, \ldots, k$ we have

\[ \mathrm{maxC}_u(G_1, H_i) \ge H'(G_1, H_i) - \mathrm{dist}_u(G_1, H_i). \qquad (3.2) \]

In the case $i = 0$ this is clearly true, and if $i = k$ then the main
statement is proved. Now we assume (3.2) for subscript $i$ and we are going
to prove it for $i + 1$. By the hypothesis we know that
$\mathrm{dist}_u(G_1, H_i) = i$. We are going to distinguish cases
according to the relation between $E(H_i) \Delta E(H_{i+1}) = S$ and
$\Delta_i$.

Assume at first that $|S \cap \Delta_i| = 0$. Then the number of circuits
in the decomposition of $\Delta_{i+1}$ is increased by one (compared to the
maximum decomposition of $\Delta_i$), the number of edges is increased by
four, and finally the number of swaps is increased by one. Inequality (3.2)
is maintained.

Now assume that $|S \cap \Delta_i| = \ell > 0$. Since $S$ is derived from
the swap transforming $H_i$ into $H_{i+1}$, the two existing edges among
the four chords defining $S$ are edges in $H_i$ and not edges in $H_{i+1}$,
and the analogous statement is true for the two missing edges. Therefore
the chords in $S \cap \Delta_i$ are in the same states in $H_0$ and in
$H_{i+1}$. Then

(a) if $\ell = 1$ then this chord does not belong to $\Delta_{i+1}$,
therefore the other three chords of $S$ extend the original circuit.
Therefore the number of circuits is the same as before, while
$|\Delta_{i+1}| = |\Delta_i| + 2$ and the number of necessary swaps is
increased by one. Inequality (3.2) is maintained;

(b) if $\ell > 1$ then the $\ell$ common chords can be in at most $\ell$
circuits. It can happen that some circuits meld into a smaller number of
circuits after the swap on $S$ is performed, but this can decrease the
number of circuits by at most $\ell - 1$. Furthermore
$|\Delta_{i+1}| = |\Delta_i| + 4 - 2\ell$, and finally the number of
necessary swaps is increased by one. Inequality (3.2) is maintained.

The proof of Theorem 3.6 is finished. □

As already mentioned, the value $\mathrm{maxC}_u$ does not seem to be
efficiently computable. Therefore Theorem 3.6 does not directly help to
find a shortest swap sequence between two particular realizations. However,
good upper and lower bounds on this value may be useful. It is clear,
however, that these bounds depend not only on the number of edges (which is
$|E|$) in one realization but also on the size of the symmetric difference.
When $|E|$ is small, say $|E| \le \frac{1}{2}\binom{n}{2}$, then the size
of the symmetric difference can be as big as $2|E|$. If $|E|$ is much
higher then the symmetric difference becomes small.

Assume now that the graphical degree sequence under investigation is
$(1, 1, 1, \ldots, 1)$. All realizations are perfect matchings, and if two
of them form one alternating Eulerian cycle, then the actual swap-distance,
by Theorem 3.6, is $|E| - 1$. This is just half of the old estimate. The
consequence is that the diameter of the corresponding Markov chain can be
as big as $|E| - 1$.
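
For very small instances this value can be checked by brute force. The Python sketch below runs a breadth-first search over all realizations reachable by swaps; the names `neighbors` and `swap_distance` are hypothetical, and the search is exponential in general, so it is meant only as a sanity check on toy examples.

```python
from collections import deque
from itertools import combinations

def neighbors(edges):
    """Yield all edge sets reachable from `edges` by one swap."""
    for e, f in combinations(edges, 2):
        if e & f:
            continue                       # the swap needs four vertices
        a, c = tuple(e)
        b, d = tuple(f)
        for new in ({frozenset((a, b)), frozenset((c, d))},
                    {frozenset((a, d)), frozenset((b, c))}):
            if not (new & edges):          # both new chords are non-edges
                yield (edges - {e, f}) | new

def swap_distance(g1, g2):
    """Length of a shortest swap sequence from g1 to g2 (BFS)."""
    start = frozenset(frozenset(e) for e in g1)
    goal = frozenset(frozenset(e) for e in g2)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        g = queue.popleft()
        if g == goal:
            return dist[g]
        for h in neighbors(g):
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    return None
```

For the two perfect matchings `[(0, 1), (2, 3), (4, 5)]` and `[(1, 2), (3, 4), (5, 0)]`, whose union is one alternating 6-cycle, the search gives distance 2, in agreement with $|E| - 1 = 3 - 1$.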

Next we give a general bound on the swap-distance (which is in some
sense sharp), and then we formulate some conjectures. For a given degree
sequence $\mathbf{d} = \{d_1, d_2, \ldots, d_n\}$ let $m$ denote
$(\sum d_i)/2$, the number of edges in any realization, and let $m^*$
denote $\sum \min(d_i, n - d_i)$, an upper bound on the number of edges in
a balanced red-blue graph associated with two realizations $G_1$ and $G_2$.

Theorem 3.8. For all pairs of realizations $G_1, G_2$ of the same degree
sequence of length $n$, we have

\[ \mathrm{dist}_u(G_1, G_2) \le H'(G_1, G_2) \left(1 - \frac{4}{3n}\right) \le m^* \left(\frac{1}{2} - \frac{2}{3n}\right) \le m \left(1 - \frac{4}{3n}\right). \]

Proof. It is a simple calculation using Theorem 3.4 and the simple fact that
$H'(G_1, G_2) \le m^*/2 \le m$. □
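
As a small numerical illustration, the quantities $m$ and $m^*$ and the two
degree-sequence bounds can be evaluated directly. The sketch below assumes
the bounds read $m^*(1/2 - 1/(3n))$ and $m(1 - 2/(3n))$; the function name
`degree_sequence_bounds` is hypothetical, and exact rational arithmetic is
used only to avoid rounding.

```python
from fractions import Fraction


def degree_sequence_bounds(degrees):
    """Return (m, m_star, bound_from_m_star, bound_from_m) for a degree sequence.

    m      = (sum of degrees) / 2, the number of edges of any realization
    m_star = sum of min(d_i, n - d_i)
    The two bounds are m_star * (1/2 - 1/(3n)) and m * (1 - 2/(3n)),
    which is an assumed reading of the displayed inequality chain.
    """
    n = len(degrees)
    m = Fraction(sum(degrees), 2)
    m_star = sum(min(d, n - d) for d in degrees)
    bound_from_m_star = m_star * (Fraction(1, 2) - Fraction(1, 3 * n))
    bound_from_m = m * (1 - Fraction(2, 3 * n))
    return m, m_star, bound_from_m_star, bound_from_m


# Perfect matchings on six vertices: both bounds evaluate to 8/3.
print(degree_sequence_bounds([1] * 6))
# The complete graph K4: m = 6, m* = 4, bounds 5/3 and 5.
print(degree_sequence_bounds([3, 3, 3, 3]))
```

For the sequence $(1, \ldots, 1)$ on six vertices both bounds equal 8/3,
which is consistent with a swap-distance of at most $|E| - 1 = 2$.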

Conjecture 1. Let $G$ be a balanced red-blue graph with $n$ vertices and $m$
edges. Then (i) there exists an alternating circuit of length at most
$3n^2/m$, and (ii) $\mathrm{maxC}_u(G) \ge m^2/(6n^2)$.

Such an upper bound would provide a lower bound on the distance, and thus
could be useful in practical applications.

Conjecture 2. For a degree sequence $\mathbf{d} = \{d_1, d_2, \ldots, d_n\}$
let $m$ again denote $(\sum d_i)/2$, the number of edges in any realization,
and let $m^*$ denote $\sum \min(d_i, n - d_i)$. Then we conjecture the
following inequalities:

(i) $\mathrm{dist}_u(G_1, G_2) \le H'(G_1, G_2) \cdot (1 - m/(3n^2))$.

(ii) $\mathrm{dist}_u(G_1, G_2) \le m^* (1/2 - m/(6n^2))$.

(iii) $\mathrm{dist}_u(G_1, G_2) \le m (1 - m/(3n^2))$.

(iv) $\mathrm{dist}_u(G_1, G_2) \le 5n^2/24$.

4. Undirected bipartite degree sequences 

It is easy to see that for bipartite degree sequences Theorem 3.6 applies
without any changes (note that in the proof we only used chords of odd
length, so they are also chords in the bipartite case). Moreover, since
there is no odd cycle in a bipartite graph, the circuits in the maximal size
alternating circuit decomposition of the symmetric difference of two
realizations are cycles. As a consequence, for two realizations $B_1$ and
$B_2$ of a bipartite degree sequence, we can interpret
$\mathrm{maxC}_u(B_1, B_2)$ as the maximum number of cycles in a
decomposition into alternating cycles (which always exists) of the
associated balanced red-blue graph $B$. However, we think that even for
bipartite realizations the determination of $\mathrm{maxC}_u$ might be hard.

Let $\mathbf{bd} = ((a_1, \ldots, a_k), (b_1, \ldots, b_\ell))$ be a given
bipartite degree sequence; we assume $\ell \le k$. Let $n = k + \ell$,
$n' = 2\ell$, $m = \sum a_i$, and let $m^*$ denote
$2 \sum \min(a_i, \ell - a_i)$, an upper bound on the number of edges in a
balanced red-blue graph associated with two realizations $B_1$ and $B_2$.
Using that any alternating cycle has length at most $n'$, similarly to
Theorem 3.8, we get the following.

Theorem 4.1. For all pairs of realizations $B_1, B_2$ of the same degree
sequence $\mathbf{bd}$ we have

$$\mathrm{dist}_u(B_1, B_2) \le H'(B_1, B_2) \cdot (1 - 2/n')
\le m^* (1/2 - 1/n') \le m (1 - 2/n').$$ □

5. Directed degree sequences 

In this section we discuss directed degree sequences. We will apply the
machinery of Section 4 to solve the directed degree sequence problem, using
the bipartite graph $B(G)$ defined in Section 2. However, in doing so we may
face a serious problem: since no loop is allowed in $G$, we cannot use edges
of the form $u_x w_x$ in the process. Recall that these pairs are called
non-chords. So we first analyze the alternating cycles we have to handle
during the process.

Let $G$ be a directed balanced red-blue graph associated with two
realizations $G_1$ and $G_2$ of the same directed degree sequence, let
$B = B(G)$ be the corresponding bipartite balanced red-blue graph, and let
$C$ be an alternating circuit in $G$ (recall that $C$ denotes the
corresponding alternating cycle in $B$). In this section we will mainly use
the terminology of the bipartite representation $B$, but, where it is
interesting, we remark in italics and in parentheses the corresponding
notions in the original directed graph.

Case 1: Let us start with the case when there exists a vertex $u_x$ in the
cycle $C$ such that $w_x$ is not contained in $C$ (of course, by symmetry,
the case when $w_x$ is in $C$ but $u_x$ is not can be handled in the same
way). (This is equivalent to saying that circuit $C$ contains $x$ only
once.) We can work with $u_x$ at each step of the process described in
Lemma 3.5: we take the trail of length 3 starting at $u_x$, and interchange
the start and stop graphs if the first edge belongs to the stop graph (we
will do this step again and again as a routine; observe that if we are given
a swap sequence from one graph to another graph, then the reverse sequence
transforms the second graph into the first one). At every step the vertex
$u_x$ remains in the cycle and $w_x$ does not become a vertex of the cycle.

Case 2: Next assume that for each vertex $u_x \in C$ we also have
$w_x \in C$, but also assume that there is a vertex $u_x$ in $C$ such that
the trail of length 3 (along the ordering of the cycle) starting at $u_x$
does not end at $w_x$. Then we can use the vertex $u_x$ in the process as
before. After the first swap, there will be two vertices which occur
without their non-chord pairs in the new cycle. So we are back to Case 1.

Case 3: Finally assume that neither Case 1 nor Case 2 applies. We can
handle this case as follows. Assume first that $C$ is long enough, that is,
$|C| \ge 8$. As every vertex participates in one non-chord, one of the two
different vertices that are at distance 3 (along the cycle) from any fixed
vertex $u_x$ must differ from $w_x$. So we are back to Case 2 after
reversing the description of $C$ and possibly interchanging the start and
stop graphs.
From now on we will call these "usual" swaps $C_4$-swaps.

When our cycle is of length 6, no such trick works. (In $G_1$ we have an
oriented triangle, and $G_2$ is identical with $G_1$ except that it contains
the other orientation of the same triangle. Then we have to use a new type
of swap: we exchange the first oriented triangle for the second one.) This
means that in the bipartite graph we swap a $C_6$ with 3 non-chords in one
step. For obvious reasons we will call this new swap a triangular
$C_6$-swap, and the cycle itself is called a triangular $C_6$ cycle.

Remark 5.1. We can now describe the type 2 triple-swaps mentioned in
Section 1 and introduced in [13, 10]: they are simply non-triangular
$C_6$-swaps, which can be implemented by two $C_4$-swaps.

Lemma 5.2. If $C$ is a cycle in the decomposition that is not a triangular
$C_6$ cycle, then we can always perform the next swap without producing a
new triangular $C_6$ cycle.

Proof. This is clearly the case when we are in Case 1. If we are in Case 3
or in Case 2, then $|C| \ge 8$ and after the first swap two neighboring
vertices are deleted from the cycle, so we are back to Case 1 (two
non-chords disappear). □

It is important to recognize that sometimes triangular $C_6$-swaps are
absolutely necessary: for example, let $n \ge 2$ be an integer and consider
the following $(n+1)$-element directed degree sequence:
$\mathbf{dd} = ((n, n, \ldots, n, n-1, n-1, n-1); (n, n, \ldots, n, n-1, n-1, n-1))$.
It is clear that there are exactly two different realizations of this
directed degree sequence: each is a complete directed graph on $n+1$
vertices minus one oriented triangle, and this oriented triangle can be of
two different kinds. For these realizations there is exactly one possible
swap: the triangular $C_6$-swap on the six corresponding vertices of the
$B(G_i)$ realizations. (The simplest such example is
$((1,1,1), (1,1,1))$.)
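
The smallest cases of this example can be confirmed by exhaustive
enumeration. The Python sketch below is an illustration under stated
assumptions: vertices are labeled 0, 1, ..., arcs are ordered pairs, and a
$C_4$-swap is taken to replace arcs x->y, z->t by x->t, z->y when neither
new arc is a loop or already present. The helper names `realizations` and
`c4_swap_neighbors` are hypothetical.

```python
from itertools import combinations, permutations


def realizations(out_deg, in_deg):
    """All loopless simple digraphs (frozensets of arcs) with the given degrees."""
    n = len(out_deg)
    arcs = [(x, y) for x in range(n) for y in range(n) if x != y]
    found = []
    for chosen in combinations(arcs, sum(out_deg)):
        outs, ins = [0] * n, [0] * n
        for x, y in chosen:
            outs[x] += 1
            ins[y] += 1
        if outs == list(out_deg) and ins == list(in_deg):
            found.append(frozenset(chosen))
    return found


def c4_swap_neighbors(graph):
    """Digraphs reachable by one C4-swap: arcs x->y, z->t become x->t, z->y."""
    result = set()
    for (x, y), (z, t) in permutations(graph, 2):
        if x == t or z == y:
            continue  # a new arc would be a loop (a non-chord in B(G))
        if (x, t) in graph or (z, y) in graph:
            continue  # a new arc would be parallel to an existing one
        result.add((graph - {(x, y), (z, t)}) | {(x, t), (z, y)})
    return result


for degrees in ((1, 1, 1), (3, 2, 2, 2)):
    graphs = realizations(degrees, degrees)
    # Number of realizations, and number of C4-swap neighbors of each one.
    print(degrees, len(graphs), [len(c4_swap_neighbors(g)) for g in graphs])
```

In both cases the enumeration finds two realizations, neither of which
admits any $C_4$-swap, so only the triangular $C_6$-swap connects them. The
enumeration tries every arc subset of the right size and is feasible only
for very small vertex sets.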

With these observations we have just proved that any realization of a
directed degree sequence can be transformed into any other realization of
the same degree sequence using only $C_4$- and triangular $C_6$-swaps.
Therefore from now on, in contrast to papers [13] and [10], we allow only
these two types of swaps, while three-edge swaps of type 2 are no longer
allowed.

Next we are going to analyze the structure of the triangular $C_6$ cycles
in a maximal cycle decomposition with the minimum number of triangular
$C_6$ cycles. (Let us start with an example (see Figure 1): for this
directed degree sequence the realizations consist of two oriented
triangles sharing one vertex.) Figure 2 shows the bipartite representation
of the symmetric difference of the corresponding $B_1$ and $B_2$. It is
easy to see that there are two possible cycle decompositions of this
symmetric difference: one consists of two triangular $C_6$ cycles, but the
other one contains none (see Figure 3).

Let the vertices be $X = \{a, b, c, d, e\}$ and
$\mathbf{dd} = ((2, 1, 1, 1, 1); (2, 1, 1, 1, 1))$.

(a) Realization $G_1$    (b) Realization $G_2$

Figure 1: Two realizations




Figure 2: The bipartite representation of the symmetric difference 
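
This example is small enough to explore by computer. The following Python
sketch is an illustration only: it searches over all realizations with
Dijkstra's algorithm, charging 1 for a $C_4$-swap and 2 for reversing an
oriented triangle. The arc sets `g1` and `g2` are one labeling consistent
with the description (two oriented triangles sharing the vertex a, both
reversed in the second realization); since the figure is not reproduced
here, this labeling and all function names are assumptions.

```python
import heapq
from itertools import count, permutations


def c4_swaps(graph):
    """One C4-swap: arcs x->y, z->t are replaced by x->t, z->y (no loops)."""
    for (x, y), (z, t) in permutations(graph, 2):
        if x == t or z == y:
            continue  # x->t or z->y would be a loop
        if (x, t) in graph or (z, y) in graph:
            continue  # would create a parallel arc
        yield (graph - {(x, y), (z, t)}) | {(x, t), (z, y)}


def triangular_c6_swaps(graph):
    """Reverse an oriented triangle whose three reversed arcs are all absent."""
    for (x, y), (y2, z) in permutations(graph, 2):
        if y2 != y or (z, x) not in graph:
            continue  # x->y, y->z, z->x is not an oriented triangle
        reverse = {(y, x), (z, y), (x, z)}
        if reverse & graph:
            continue
        yield (graph - {(x, y), (y, z), (z, x)}) | reverse


def weighted_swap_distance(g1, g2):
    """Dijkstra over realizations: C4-swaps cost 1, triangle reversals cost 2."""
    start, goal = frozenset(g1), frozenset(g2)
    best = {start: 0}
    tie = count()  # tie-breaker so the heap never has to compare graphs
    heap = [(0, next(tie), start)]
    while heap:
        dist, _, graph = heapq.heappop(heap)
        if graph == goal:
            return dist
        if dist > best[graph]:
            continue  # stale heap entry
        moves = [(1, h) for h in c4_swaps(graph)]
        moves += [(2, h) for h in triangular_c6_swaps(graph)]
        for cost, nxt in moves:
            if dist + cost < best.get(nxt, float("inf")):
                best[nxt] = dist + cost
                heapq.heappush(heap, (dist + cost, next(tie), nxt))
    return None  # returned only if g2 was never reached


# Assumed labeling: triangles a->b->c->a and a->d->e->a, reversed in g2.
g1 = {("a", "b"), ("b", "c"), ("c", "a"), ("a", "d"), ("d", "e"), ("e", "a")}
g2 = {(y, x) for x, y in g1}
print(weighted_swap_distance(g1, g2))
```

With this labeling the search returns 4. The symmetric difference has 12
arcs and contains no alternating cycle of length 4, so any decomposition has
at most two cycles, and half of 12 minus 2 is again 4.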

Fortunately, this is the typical behavior. We say that two cycles in the
decomposition of the bipartite representation are kissing if there exists
a vertex $x \in X$ such that both alternating cycles in the decomposition
contain both $u_x$ and $w_x$. If one or both kissing cycles are triangular
$C_6$ cycles, then we can transform these two cycles into a new
decomposition without any triangular $C_6$. To that end we consider the four
trails defined by $u_x$ and $w_x$ and pair them up in the right way.
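
The kissing condition is easy to test mechanically. In the Python sketch
below, a cycle of the bipartite representation is modeled as a list of pairs
`("u", x)` or `("w", x)`; this encoding and the names `triangular_c6` and
`kissing_vertices` are assumptions made for illustration.

```python
def triangular_c6(x, y, z):
    """Vertices, in cycle order, of the triangular C6 over the triangle x, y, z."""
    return [("u", x), ("w", y), ("u", z), ("w", x), ("u", y), ("w", z)]


def kissing_vertices(cycle1, cycle2):
    """Return every x such that u_x and w_x both lie on both cycles."""
    def paired(cycle):
        on_cycle = set(cycle)
        return {v for side, v in on_cycle if side == "u" and ("w", v) in on_cycle}
    return paired(cycle1) & paired(cycle2)


# Two triangular C6 cycles over triangles that share the vertex a.
first = triangular_c6("a", "b", "c")
second = triangular_c6("a", "d", "e")
print(kissing_vertices(first, second))  # the shared vertex a
```

An empty result means the two cycles do not kiss; the re-pairing of the four
trails at $u_x$ and $w_x$ is not implemented here.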

With this observation we just proved the following structural property: 

Lemma 5.3. Assume that the alternating cycle decomposition $\mathcal{C}$ of
the symmetric difference of $B(G_1)$ and $B(G_2)$ is a maximal one with the
minimum number of triangular $C_6$ cycles. Then no triangular $C_6$ cycle
kisses any other cycle.

(a) Decomposition into triangular $C_6$s    (b) and into non-triangular $C_6$s

Figure 3: Two possible cycle decompositions

We are now ready to define the swap-distance of two arbitrary realizations
of the same directed degree sequence. We consider a weighted swap-distance:
an ordinary $C_4$-swap weighs one, but a triangular $C_6$-swap weighs two.
(This convention is well supported by the fact that two kissing triangular
cycles can be transformed into two ordinary length 6 cycles, each of them
transformable using two $C_4$-swaps.) So $\mathrm{dist}_d(G_1, G_2)$ denotes
the minimum total weight of a swap sequence transforming $G_1$ into $G_2$.
The definition of $\mathrm{maxC}_d(G_1, G_2)$ is analogous to the undirected
case: this is the maximum possible number of directed cycles in an
alternating directed cycle decomposition of the symmetric difference of the
edge sets. Using these definitions we have the following result on the
minimum directed swap-distance:

Theorem 5.4. Let $\mathbf{dd}$ be a directed degree sequence with
realizations $G_1$ and $G_2$. Then

$$\mathrm{dist}_d(G_1, G_2) = H'(G_1, G_2) - \mathrm{maxC}_d(G_1, G_2). \tag{5.1}$$

Proof. We can prove that the LHS is at most as big as the RHS by recalling
the proof of Theorem 3.6 for bipartite graphs and taking into consideration
the previous observations. By Lemma 5.2, we do not create any new
triangular $C_6$ cycle.

Consider now the equivalent form of inequality (3.2). Fix a suitable
swap sequence of minimal weighted length and assume that it contains the
smallest possible number of triangular $C_6$-swaps among all such sequences.
Denote by $B(G_1) = H_0, H_1, \ldots, H_{k-1}, H_k = B(G_2)$ the bipartite
graph sequence which consists of the consecutive realizations generated by
this swap sequence (then for each $i = 0, \ldots, k-1$, the graphs $H_i$ and
$H_{i+1}$ are at swap-distance 1 or 2, depending on whether the swap was a
simple $C_4$-swap or a triangular $C_6$-swap). It is clear that

(A) any consecutive swap subsequence from $H_i$ to $H_j$ must also be a
minimum one; furthermore, it must contain the smallest possible number of
triangular $C_6$-swaps among all such subsequences.

For each $i$ we use the following notation:

$$A_i := E(H_0) \,\triangle\, E(H_i).$$

We will now revisit the proof of Theorem 3.6 (ii) and try to mimic it.
Whenever the swap under study is a triangular $C_6$-swap, it cannot share a
non-chord with any earlier generated cycle. (This follows from a
straightforward generalization of Lemma 5.3.)

Whenever the swap under investigation is a regular $C_4$-swap, we can
proceed as in the proof of Theorem 3.6 (ii). This concludes the proof of
Theorem 5.4. □

We can strengthen LaMar's recent result ([14]): 

Theorem 5.5. If we modify the definition of the weighted swap-distance
between two realizations of a directed degree sequence such that any length
6 circuit can be swapped in one step, with weight 2, then Theorem 5.4 still
holds, and there exists a shortest swap sequence between any two
realizations $G_1$ and $G_2$ of the same directed degree sequence which
contains only $C_4$-swaps and triangular $C_6$-swaps.

The novelty here is that there always exists a shortest swap sequence which
conforms to LaMar's idea.

Proof. Any length 6 circuit which is not a triangular $C_6$ cycle falls
under the above-discussed Case 1 or 2, and can be transformed with two
$C_4$-swaps, whose summed weight is also 2. □

Theorem 4.1 transforms to the following. Let us be given a directed degree
sequence $\mathbf{dd} = ((d_1^+, \ldots, d_n^+), (d_1^-, \ldots, d_n^-))$. Let

$$m = \sum_i d_i^+ \quad\text{and}\quad
m^* = \sum_i \left[\min(d_i^+, n - d_i^+) + \min(d_i^-, n - d_i^-)\right],$$

where $m^*$ is an upper bound on the number of edges in a balanced red-blue
graph associated with two realizations $G_1$ and $G_2$. Then

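Theorem 5.6. For all pairs of realizations $G_1, G_2$ of the same degree
sequence $\mathbf{dd}$ we have

$$\mathrm{dist}_d(G_1, G_2) \le H'(G_1, G_2) \cdot \left(1 - \frac{1}{n}\right)
\le m^* \left(\frac{1}{2} - \frac{1}{2n}\right) \le m \left(1 - \frac{1}{n}\right).$$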

To finish our paper, recall that Greenhill proved in [6, Lemma 2.2] that
in the case of regular degree sequences any two directed realizations can be
transformed into each other using $C_4$-swaps only. (In this case she calls
these $C_4$-swaps switches.) A similar notion was studied by Berger and
Müller-Hannemann (see [1]). However, consider the following example: let
$\mathbf{dd}$ be a two-regular directed degree sequence with six vertices.
In Figure 4 we show two realizations. The symmetric difference of these two
realizations is one triangular $C_6$ cycle. Therefore the swap sequence
generated by Greenhill cannot be a minimal one. (Of course, in her
application this was never a requirement: she uses the above-mentioned
result successfully to prove rapid mixing of the sampling algorithm for
regular directed bipartite graphs.)

(a) Realization $G_1$    (b) Realization $G_2$

Figure 4: Two realizations with a triangular $C_6$ as symmetric difference